Robust model selection with flexible trimming

Authors

  • Marco Riani
  • Anthony C. Atkinson
Abstract

We use the forward search to provide data-driven flexible trimming of the Cp statistic for the choice of regression models, thus revealing the effect of multiple outliers on model selection. We show that, even in small sub-samples of the data (that is, with heavy trimming), the statistic has, to a very close approximation, a known null distribution: a scaled and shifted F. We discuss the theoretical background to this form of distributional robustness. Our results on the null distribution provide a framework for informed robust model choice. Two examples of widely differing size are analysed. We develop a powerful graphical tool, the generalized candlestick plot, which summarises the information on forward searches and model choice for numerous models. We compare our method of flexible trimming with the use by Ronchetti and Staudte (1994) of M-estimation in robust model choice. The candlestick plot elucidates the properties of the two methods.

In the related problem of the selection of non-nested models for Gaussian time series without regressors it is customary to use AIC. However, the distribution of this statistic is known only asymptotically. To provide informed model choice we accordingly derive a Cp statistic, now with known asymptotic properties, which hold to an excellent approximation even in small samples. The statistic may be used for the choice of, for example, an ARMA time series model with Gaussian errors, with or without regressor variables. The method can be applied not only to ARIMA models but also to the analysis of structural time series. As an example we analyse a time series of one hundred observations on the one-day-ahead baseload price of electricity. A characteristic of such series is that they combine a stochastic process with explanatory variables. A complicated example is given by Koopman, Ooms, and Carnero (2007). We obtain a surprisingly rich understanding of potentially good and poor models. The known distribution of our statistic again greatly aids interpretation of our plots.
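As a point of reference for the statistic named in the abstract, the following is a minimal sketch of the classical Mallows Cp for a regression submodel, Cp = RSS_p / s^2 - n + 2p, where s^2 is the residual variance estimate from the full model. It is computed here on all n observations; the forward-search trimming, the scaled and shifted F reference distribution, and the candlestick plot described above are not reproduced. The function names (`solve`, `rss`, `mallows_cp`) and the simulated data are illustrative assumptions, not the authors' code.

```python
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    p = len(X[0])
    # Normal equations X'X beta = X'y (adequate for this small, well-conditioned sketch).
    XtX = [[sum(r[a] * r[c] for r in X) for c in range(p)] for a in range(p)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(p)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def mallows_cp(X_full, y, cols):
    """Classical Mallows Cp for the submodel using columns `cols` of X_full."""
    n, p_full = len(X_full), len(X_full[0])
    s2 = rss(X_full, y) / (n - p_full)  # variance estimate from the full model
    X_sub = [[r[c] for c in cols] for r in X_full]
    return rss(X_sub, y) / s2 - n + 2 * len(cols)


# Simulated data (an assumption for illustration): x2 has no effect on y.
rng = random.Random(0)
n = 30
X = [[1.0, rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
y = [1.0 + 2.0 * r[1] - 3.0 * r[3] + rng.gauss(0, 1) for r in X]

print("full model      :", mallows_cp(X, y, [0, 1, 2, 3]))
print("drop inert x2   :", mallows_cp(X, y, [0, 1, 3]))
print("drop active x3  :", mallows_cp(X, y, [0, 1, 2]))
```

By construction the full model always gives Cp equal to its number of parameters, and a submodel that omits an important variable gives a much larger value. In practice a numerical linear algebra library would replace the hand-written solver.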

Similar resources

Robust portfolio selection with polyhedral ambiguous inputs

Ambiguity in the inputs of the models is typical, especially in the portfolio selection problem, where the true distribution of random variables is usually unknown. Here we use a robust optimization approach to address the ambiguity in the conditional-value-at-risk minimization model. We obtain explicit models of the robust conditional-value-at-risk minimization for polyhedral and correlated polyhedral am...

A robust multi-objective global supplier selection model under currency fluctuation and price discount

A robust supplier selection problem is proposed in a scenario-based approach, where demand and exchange rates are subject to uncertainty. First, a deterministic multi-objective mixed integer linear programming model is developed; then the robust counterpart of the proposed model is presented using a recent extension of robust optimization theory. We discuss dec...

Primal and dual robust counterparts of uncertain linear programs: an application to portfolio selection

This paper proposes a family of robust counterparts for uncertain linear programs (LP), obtained for a general definition of the uncertainty region. The relationship between uncertainty sets using norm bodies and their corresponding robust counterparts defined by dual norms is presented. These properties lead us to characterize primal and dual robust counterparts. The researchers show t...

Negative Selection Based Data Classification with Flexible Boundaries

One of the most important artificial immune algorithms is the negative selection algorithm, an anomaly detection and pattern recognition technique; recent research has also shown the successful application of this algorithm in data classification. Most negative selection methods use deterministic boundaries to distinguish between self and non-self spaces. In this paper, two...

A two-stage robust model for portfolio selection by using goal programming

In portfolio selection models, uncertainty plays an important role. Uncertainty in the parameters leads the solution away from the optimum, so it needs to be considered in the models. In this paper we present a two-stage robust model that, in the first stage, determines the desired percentage of investment in each industrial group using return and risk measures from different industries. One rea...

Journal:
  • Computational Statistics & Data Analysis

Volume 54

Publication date: 2010